A classification of tasks in bioinformatics
Abstract
Motivation: This paper reports on a survey of bioinformatics tasks currently undertaken by working biologists. The aim was to find the range of tasks that need to be supported and the components needed to do this in a general query system. This enabled a set of evaluation criteria to be used to assess both the biology and mechanical nature of general query systems.
Results: A classification of the biological content of the tasks gathered offers a checklist for those tasks (and their specialisations) that should be offered in a general bioinformatics query system. This semantic analysis was contrasted with a syntactic analysis that revealed the small number of components required to describe all bioinformatics questions. Both the range of biological tasks and syntactic task components can be seen to provide a set of bioinformatics requirements for general query systems. These requirements were used to evaluate two bioinformatics query systems.
Similar resources
SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy
In this paper, we propose a new gene selection algorithm based on Shuffled Frog Leaping Algorithm that is called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most of the biological datasets such as cancer datasets have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....
Bioinformatics to Biostochastics: Statistical Perspectives and Tasks Ahead
Bioinformatics is an emerging field of science emphasizing the application of mathematics, statistics, and informatics to study and analysis of very large molecular biological (mostly, genetic and genomic) systems (data sets). In a comparatively broader setup of large biological systems without necessarily having a predominant genetic undercurrent, and having genesis in biometry to biostatistic...
Automatic classification of highly related Malate Dehydrogenase and L-Lactate Dehydrogenase based on 3D-pattern of active sites
Accurate protein function prediction is an important subject in bioinformatics, especially where sequentially and structurally similar proteins have different functions. Malate dehydrogenase and L-lactate dehydrogenase are two evolutionary related enzymes, which exist in a wide variety of organisms. These enzymes are sequentially and structurally similar and share common active site residues, spati...
Perform Three Data Mining Tasks with Crowdsourcing Process
For data mining studies, because of the complexity of doing feature selection process in tasks by hand, we need to send some of labeling to the workers with crowdsourcing activities. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the age or geography of the users' residence. Uncertainty about the performance of virtual user...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
Journal: Bioinformatics
Volume 17, Issue 2
Publication date: 2001